A data-driven missing value imputation approach for longitudinal datasets
Authors
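Abstract
Longitudinal datasets of human ageing studies usually have a high volume of missing data, and one way to handle missing values in a dataset is to replace them with estimations. However, there are many methods to estimate missing values, and no single method is the best for all datasets. In this article, we propose a data-driven missing value imputation approach that performs a feature-wise selection of the best imputation method, using known information in the dataset to rank the five methods we selected, based on their estimation error rates. We evaluated the proposed approach in two sets of experiments: a classifier-independent scenario, where we compared the applicabilities and error rates of each imputation method; and a classifier-dependent scenario, where we compared the predictive accuracy of Random Forest classifiers generated with datasets prepared using each imputation method and a baseline approach of doing no imputation (letting the classification algorithm handle the missing values internally). Based on our results from both sets of experiments, we concluded that the proposed data-driven missing value imputation approach generally resulted in models with more accurate estimations for missing data and better performing classifiers, in longitudinal datasets of human ageing. We also observed that imputation methods devised specifically for longitudinal data had very accurate estimations. This reinforces the idea that using the temporal information intrinsic to longitudinal data is a worthwhile endeavour for machine learning applications, and that can be achieved through the proposed data-driven approach.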
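The feature-wise selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the five imputation methods, so the three candidates below (`impute_mean`, `impute_median`, `impute_previous_wave`) are hypothetical stand-ins, and the column layout (consecutive columns holding consecutive waves of one variable), the hold-out fraction, and the use of mean absolute error are all assumptions made for illustration. For each feature, some known values are hidden, every candidate estimates them, and the candidates are ranked by estimation error.

```python
import numpy as np

# Hypothetical candidate methods: each returns an estimate for every row of column j.
def impute_mean(X, j):
    return np.full(X.shape[0], np.nanmean(X[:, j]))

def impute_median(X, j):
    return np.full(X.shape[0], np.nanmedian(X[:, j]))

def impute_previous_wave(X, j):
    # Assumption: column j-1 holds the same variable measured at the previous wave.
    if j == 0:
        return np.full(X.shape[0], np.nan)  # not applicable to the first wave
    return X[:, j - 1].copy()

CANDIDATES = {
    "mean": impute_mean,
    "median": impute_median,
    "previous_wave": impute_previous_wave,
}

def rank_methods_for_feature(X, j, candidates, hide_fraction=0.2, seed=0):
    """Rank candidates for column j by mean absolute error on hidden known values."""
    rng = np.random.default_rng(seed)
    known = np.flatnonzero(~np.isnan(X[:, j]))
    if known.size < 2:
        # Too little known information to rank; keep the given order.
        return list(candidates), {name: np.inf for name in candidates}
    n_hide = min(max(1, int(round(hide_fraction * known.size))), known.size - 1)
    hidden = rng.choice(known, size=n_hide, replace=False)
    X_masked = X.copy()
    X_masked[hidden, j] = np.nan  # pretend these known values are missing
    truth = X[hidden, j]
    errors = {}
    for name, method in candidates.items():
        estimates = method(X_masked, j)[hidden]
        usable = ~np.isnan(estimates)
        # A method that cannot estimate any hidden value gets infinite error.
        errors[name] = (
            float(np.mean(np.abs(estimates[usable] - truth[usable])))
            if usable.any() else np.inf
        )
    return sorted(errors, key=errors.get), errors

def impute_feature_wise(X, candidates, hide_fraction=0.2, seed=0):
    """Fill each column with its best-ranked method, falling back down the ranking."""
    X_out = X.copy()
    chosen = {}
    for j in range(X.shape[1]):
        if not np.isnan(X[:, j]).any():
            continue
        ranking, _ = rank_methods_for_feature(X, j, candidates, hide_fraction, seed)
        chosen[j] = ranking[0]
        for name in ranking:
            estimates = candidates[name](X, j)
            fill = np.isnan(X_out[:, j]) & ~np.isnan(estimates)
            X_out[fill, j] = estimates[fill]
    return X_out, chosen

# Toy data: two waves of one variable, with one missing value in the second wave.
X_demo = np.array([
    [1.0, 1.1], [2.0, 2.1], [4.0, 4.1], [8.0, 8.1], [16.0, np.nan], [32.0, 32.1],
])
X_filled, chosen_methods = impute_feature_wise(X_demo, CANDIDATES)
print(chosen_methods)
```

In this toy example the second wave tracks the first closely, so the method that uses temporal information ranks first for that column. The fallback loop is one simple way to deal with a method that is not applicable to some values; the abstract mentions applicability but does not say how the authors handle it.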
Similar resources
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Multivariate time series missing data imputation is a challenging topic and needs to be carefully considered before learning or predicting time series. Frequent researches have been done on the use of diffe...
A Structured Prediction Approach for Missing Value Imputation
Missing value imputation is an important practical problem. There is a large body of work on it, but there does not exist any work that formulates the problem in a structured output setting. Also, most applications have constraints on the imputed data, for example on the distribution associated with each variable. None of the existing imputation methods use these constraints. In this paper we p...
Missing Value Imputation Based on Data Clustering
We propose an efficient nonparametric missing value imputation method based on clustering, called CMI (Clustering-based Missing value Imputation), for dealing with missing values in target attributes. In our approach, we impute the missing values of an instance A with plausible values that are generated from the data in the instances which do not contain missing values and are most similar to t...
Collateral Missing Value Imputation: A New Robust Missing Value Estimation Algorithm For Microarray Data
Motivation: Microarray data is used in a range of application areas in biology, though often it contains considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms so there is a strong motivation to estimate these values as accurately as possible prior to using these algorithms. While many imputation algo...
Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data
MOTIVATION Microarray data are used in a range of application areas in biology, although often it contains considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms so there is a strong motivation to estimate these values as accurately as possible before using these algorithms. While many imputation algo...
Journal
Journal title: Artificial Intelligence Review
Year: 2021
ISSN: 0269-2821, 1573-7462
DOI: https://doi.org/10.1007/s10462-021-09963-5